Pesquisa | Portal Regional da BVS

Refining breast cancer biomarker discovery and drug targeting through an advanced data-driven approach.

Rakhshaninejad, Morteza; Fathian, Mohammad; Shirkoohi, Reza; Barzinpour, Farnaz; Gandomi, Amir H.

BMC Bioinformatics ; 25(1): 33, 2024 Jan 22.

Artigo em Inglês | MEDLINE | ID: mdl-38253993

RESUMO

Breast cancer remains a major public health challenge worldwide. The identification of accurate biomarkers is critical for the early detection and effective treatment of breast cancer. This study utilizes an integrative machine learning approach to analyze breast cancer gene expression data for superior biomarker and drug target discovery. Gene expression datasets, obtained from the GEO database, were merged post-preprocessing. From the merged dataset, differential expression analysis between breast cancer and normal samples revealed 164 differentially expressed genes. Meanwhile, a separate gene expression dataset revealed 350 differentially expressed genes. Additionally, the BGWO_SA_Ens algorithm, integrating binary grey wolf optimization and simulated annealing with an ensemble classifier, was employed on gene expression datasets to identify predictive genes including TOP2A, AKR1C3, EZH2, MMP1, EDNRB, S100B, and SPP1. From over 10,000 genes, BGWO_SA_Ens identified 1404 in the merged dataset (F1 score: 0.981, PR-AUC: 0.998, ROC-AUC: 0.995) and 1710 in the GSE45827 dataset (F1 score: 0.965, PR-AUC: 0.986, ROC-AUC: 0.972). The intersection of DEGs and BGWO_SA_Ens selected genes revealed 35 superior genes that were consistently significant across methods. Enrichment analyses uncovered the involvement of these superior genes in key pathways such as AMPK, Adipocytokine, and PPAR signaling. Protein-protein interaction network analysis highlighted subnetworks and central nodes. Finally, a drug-gene interaction investigation revealed connections between superior genes and anticancer drugs. Collectively, the machine learning workflow identified a robust gene signature for breast cancer, illuminated their biological roles, interactions and therapeutic associations, and underscored the potential of computational approaches in biomarker discovery and precision oncology.

Assuntos

Biomarcadores Tumorais , Neoplasias da Mama , Humanos , Feminino , Biomarcadores Tumorais/genética , Medicina de Precisão , Algoritmos , Sistemas de Liberação de Medicamentos , Neoplasias da Mama/tratamento farmacológico , Neoplasias da Mama/genética

A novel bio-inspired hybrid multi-filter wrapper gene selection method with ensemble classifier for microarray data.

Nouri-Moghaddam, Babak; Ghazanfari, Mehdi; Fathian, Mohammad.

Neural Comput Appl ; 35(16): 11531-11561, 2023.

Artigo em Inglês | MEDLINE | ID: mdl-34539088

RESUMO

Microarray technology is known as one of the most important tools for collecting DNA expression data. This technology allows researchers to investigate and examine types of diseases and their origins. However, microarray data are often associated with a small sample size, a significant number of genes, imbalanced data, etc., making classification models inefficient. Thus, a new hybrid solution based on a multi-filter and adaptive chaotic multi-objective forest optimization algorithm (AC-MOFOA) is presented to solve the gene selection problem and construct the Ensemble Classifier. In the proposed solution, a multi-filter model (i.e., ensemble filter) is proposed as preprocessing step to reduce the dataset's dimensions, using a combination of five filter methods to remove redundant and irrelevant genes. Accordingly, the results of the five filter methods are combined using a voting-based function. Additionally, the results of the proposed multi-filter indicate that it has good capability in reducing the gene subset size and selecting relevant genes. Then, an AC-MOFOA based on the concepts of non-dominated sorting, crowding distance, chaos theory, and adaptive operators is presented. AC-MOFOA as a wrapper method aimed at reducing dataset dimensions, optimizing KELM, and increasing the accuracy of the classification, simultaneously. Next, in this method, an ensemble classifier model is presented using AC-MOFOA results to classify microarray data. The performance of the proposed algorithm was evaluated on nine public microarray datasets, and its results were compared in terms of the number of selected genes, classification efficiency, execution time, time complexity, hypervolume indicator, and spacing metric with five hybrid multi-objective methods, and three hybrid single-objective methods. According to the results, the proposed hybrid method could increase the accuracy of the KELM in most datasets by reducing the dataset's dimensions and achieve similar or superior performance compared to other multi-objective methods. Furthermore, the proposed Ensemble Classifier model could provide better classification accuracy and generalizability in the seven of nine microarray datasets compared to conventional ensemble methods. Moreover, the comparison results of the Ensemble Classifier model with three state-of-the-art ensemble generation methods indicate its competitive performance in which the proposed ensemble model achieved better results in the five of nine datasets. Supplementary Information: The online version contains supplementary material available at 10.1007/s00521-021-06459-9.

RESUMO

Assuntos

RESUMO

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA